Current focus: CLEAN and dirty images for commonly observed targets
1300 crops, sampled on a 200 px grid in x and y, keeping only crops with < 30% NaN pixels
Beam size can differ between the basic and enhanced image products, and the enhanced images may also contain coordinate shifts for alignment. Crop coordinates are determined from the enhanced images, and the corresponding crop is taken from the basic image product, with slight adjustments if the resulting image is not 256x256
23000 crops (paired images)
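The cropping step above can be sketched as a sliding-window pass with a NaN-fraction filter. This is a minimal illustration assuming the image is a 2D NumPy array; the grid step, crop size, and NaN threshold follow the numbers stated above, but the function name and structure are hypothetical.

```python
import numpy as np

def extract_crops(image, crop_size=256, step=200, max_nan_frac=0.30):
    """Slide a crop_size window over `image` on a `step` px grid,
    keeping crops whose NaN fraction is below max_nan_frac.
    Returns the kept crops and their (y, x) corner coordinates."""
    crops, coords = [], []
    h, w = image.shape
    for y in range(0, h - crop_size + 1, step):
        for x in range(0, w - crop_size + 1, step):
            crop = image[y:y + crop_size, x:x + crop_size]
            if np.isnan(crop).mean() < max_nan_frac:
                crops.append(crop)
                coords.append((y, x))
    return crops, coords
```

For the paired products, the same `coords` (found on the enhanced image) would then index into the basic image.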
What parts of a scientific image do the various attention heads highlight/segment as important?
Examples with random crops from MIGHTEE and MGCLS (six figures: MIGHTEE, MGCLS, MIGHTEE, MGCLS, MIGHTEE, MIGHTEE).
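Per-head attention maps of the kind shown in these examples can be extracted as follows. This is a self-contained sketch using a single randomly initialised attention layer, not tied to any particular pre-trained checkpoint: a 256x256 image with 16x16 patches gives a 16x16 grid of 256 patch tokens plus one CLS token, and each head's CLS-to-patch attention row reshapes into a 16x16 map that can be upsampled over the image.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed_dim, num_heads, grid = 64, 4, 16
# [CLS] token followed by the 16x16 = 256 patch tokens
tokens = torch.randn(1, 1 + grid * grid, embed_dim)

attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
# average_attn_weights=False keeps one attention map per head
_, weights = attn(tokens, tokens, tokens,
                  need_weights=True, average_attn_weights=False)
# weights: (batch, heads, query, key); take the CLS query, drop the CLS key
cls_attn = weights[0, :, 0, 1:]                       # (heads, 256)
head_maps = cls_attn.reshape(num_heads, grid, grid)   # one 16x16 map per head
```

With a real pre-trained ViT the same reshaping applies to the attention weights of the last block.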
99% of variance explained with 11 components (ViT) and >500 components (ResNet)
99% of variance explained with 373 components (pre-trained) and 33 components (fine-tuned)
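The component counts above come from a cumulative explained-variance check, which can be sketched as below. The function name is hypothetical and the embeddings array is a stand-in for the actual ViT/ResNet embeddings.

```python
import numpy as np
from sklearn.decomposition import PCA

def n_components_for_variance(embeddings, target=0.99):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target`."""
    pca = PCA().fit(embeddings)
    cumvar = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumvar, target) + 1)
```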
FAISS query of MIGHTEE on embeddings from pre-trained ViT

FAISS query of MIGHTEE on embeddings from pre-trained ResNet

FAISS query of MGCLS on embeddings from pre-trained ResNet

FAISS query of MGCLS on embeddings from fine-tuned ResNet

Pre-trained ViTs look promising, given how the attention heads already seem to select sources. However, fine-tuning with the default set of hyperparameters does not result in the loss decreasing, regardless of dataset size. Is this simply a matter of finding the correct hyperparameters, or is there something else to consider?
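One cheap way to probe the hyperparameter hypothesis is a coarse learning-rate sweep before committing to full fine-tuning runs. The sketch below does this on a toy regression task purely to illustrate the procedure; the model, data, optimiser, and LR grid are all placeholders, not the actual fine-tuning setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 16)
y = x @ torch.randn(16, 1)

def final_loss(lr, steps=200):
    """Train a fresh toy model at learning rate `lr`, return final loss."""
    model = nn.Linear(16, 1)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# If the loss decreases for some LR but not the default, it was a
# hyperparameter issue; if it decreases for none, look elsewhere
# (data scaling, frozen layers, loss definition, ...).
results = {lr: final_loss(lr) for lr in (1e-1, 1e-2, 1e-3, 1e-4)}
```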